Clique counting in MapReduce: theory and experiments
Authors
Abstract
We present exact and approximate MapReduce estimators for the number of cliques of size k in an undirected graph, for any small constant k ≥ 3. Besides theoretically analyzing our algorithms in the computational model for MapReduce introduced by Karloff, Suri, and Vassilvitskii, we present the results of extensive computational experiments on the Amazon EC2 platform. Our experiments show the practical effectiveness of our algorithms even on clusters of small/medium size, and suggest their scalability to larger clusters.
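To make the counting task concrete, the following is a minimal single-process sketch of an exact k-clique count organised as map, shuffle, and reduce steps. It is not the authors' algorithm: the function name `count_k_cliques`, the degree-based ranking, and the shared in-memory edge set are all assumptions made for illustration. In a real MapReduce job each reducer would have to receive the edges among its node's neighbours through a further round rather than read a global set.

```python
from collections import defaultdict
from itertools import combinations


def count_k_cliques(edges, k):
    """Count k-cliques exactly (hypothetical helper, single-process simulation)."""
    # Normalise the input: drop self-loops and duplicate edges.
    edge_set = {frozenset(e) for e in edges if e[0] != e[1]}

    degree = defaultdict(int)
    for e in edge_set:
        for node in e:
            degree[node] += 1

    def rank(node):
        # Total order on nodes: by degree, ties broken by node id.
        return (degree[node], node)

    # "Map" + "shuffle": key each edge by its lower-ranked endpoint,
    # so that node u collects only its higher-ranked neighbours.
    higher = defaultdict(set)
    for e in edge_set:
        u, v = sorted(e, key=rank)
        higher[u].add(v)

    # "Reduce": for each node u, count the (k-1)-subsets of its
    # higher-ranked neighbours that are pairwise adjacent. Each k-clique
    # is counted exactly once, at its lowest-ranked node.
    # Simplification: the reducer reads the global edge_set directly.
    total = 0
    for u, nbrs in higher.items():
        for subset in combinations(sorted(nbrs, key=rank), k - 1):
            if all(frozenset(p) in edge_set for p in combinations(subset, 2)):
                total += 1
    return total
```

Assigning each clique to its lowest-ranked node is one common way to avoid double counting and to keep per-node work small on skewed degree distributions; whether the paper uses this exact ordering is not stated in the abstract.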
Similar resources
Scalable Scientific Computing Algorithms Using MapReduce
Cloud computing systems, like MapReduce and Pregel, provide a scalable and fault tolerant environment for running computations at massive scale. However, these systems are designed primarily for data intensive computational tasks, while a large class of problems in scientific computing and business analytics are computationally intensive (i.e., they require a lot of CPU in addition to I/O). In ...
Mining maximal cliques from a large graph using MapReduce: Tackling highly uneven subproblem sizes
We consider Maximal Clique Enumeration (MCE) from a large graph. A maximal clique is perhaps the most fundamental dense substructure in a graph, and MCE is an important tool to discover densely connected subgraphs, with numerous applications to data mining on web graphs, social networks, and biological networks. While effective sequential methods for MCE are known, scalable parallel methods for...
Lessons from the Congested Clique Applied to MapReduce
The main results of this paper are (I) a simulation algorithm which, under quite general constraints, transforms algorithms running on the Congested Clique into algorithms running in the MapReduce model, and (II) a distributed O(∆)-coloring algorithm running on the Congested Clique which has an expected running time of O(1) rounds, if ∆ ≥ Θ(log n); and O(log log log n) rounds otherwise. Applyin...
Exploiting Problem Structure for Solution Counting
This paper deals with the challenging problem of counting the number of solutions of a CSP, denoted #CSP. Recent progress has been made using search methods, such as BTD [15], which exploit the constraint graph structure in order to solve CSPs. We propose to adapt BTD for solving the #CSP problem. The resulting exact counting method has a worst-case time complexity exponential in a specific gr...
Solution Counting for CSP and SAT with Large Tree-Width
This paper deals with the challenging problem of counting the number of solutions of a CSP, denoted #CSP. Recent progress has been made using search methods, such as Backtracking with Tree-Decomposition (BTD) [Jégou and Terrioux, 2003], which exploit the constraint graph structure in order to solve CSPs. We propose to adapt BTD for solving the #CSP problem. The resulting exact counting method h...